1.
Sci Rep; 13(1): 8243, 2023 May 22.
Article in English | MEDLINE | ID: mdl-37217589

ABSTRACT

Vaccine discovery against eukaryotic parasites is not trivial, as highlighted by the limited number of known vaccines compared to the number of protozoal diseases that need one. Only three of 17 priority diseases have commercial vaccines. Live and attenuated vaccines have proved to be more effective than subunit vaccines but pose more unacceptable risks. One promising approach for subunit vaccines is in silico vaccine discovery, which predicts protein vaccine candidates given thousands of target organism protein sequences. This approach, nonetheless, is an overarching concept with no standardised guidebook on implementation. No known subunit vaccines against protozoan parasites exist as a result of this approach, and consequently there are none to emulate. The study goal was to combine current in silico discovery knowledge specific to protozoan parasites and develop a workflow representing a state-of-the-art approach. This approach reflectively integrates a parasite's biology, a host's immune system defences, and, importantly, the bioinformatics programs needed to predict vaccine candidates. To demonstrate the workflow's effectiveness, every Toxoplasma gondii protein was ranked by its capacity to provide long-term protective immunity. Although testing in animal models is required to validate these predictions, most of the top-ranked candidates are supported by publications, reinforcing our confidence in the approach.


Subjects
Parasites, Protozoan Vaccines, Toxoplasma, DNA Vaccines, Animals, Mice, Proteins, Subunit Vaccines, Protozoan Proteins/genetics, Protozoan Antibodies, Protozoan Antigens/genetics, Inbred BALB C Mice
2.
FEMS Microbiol Rev; 47(2), 2023 Mar 10.
Article in English | MEDLINE | ID: mdl-36806618

ABSTRACT

Reverse vaccinology (RV) was described at its inception in 2000 as an in silico process that starts from the genomic sequence of the pathogen and ends with a list of potential protein and/or peptide candidates to be experimentally validated for vaccine development. Twenty-two years later, this process has evolved from a few steps entailing a handful of bioinformatics tools to a multitude of steps with a plethora of tools. Other in silico-related processes with overlapping workflow steps have also emerged with terms such as subtractive proteomics, computational vaccinology, and immunoinformatics. From the perspective of a new RV practitioner, determining the appropriate workflow steps and bioinformatics tools can be a time-consuming and overwhelming task, given the number of choices. This review presents the current understanding of RV and its usage in the research community as determined by a comprehensive survey of scientific papers published in the last seven years. We believe the current mainstream workflow steps and tools presented here will be a valuable guideline for all researchers wanting to apply an up-to-date in silico vaccine discovery process.


Subjects
Vaccines, Vaccinology, Vaccinology/methods, Genomics/methods, Computational Biology/methods, Proteomics/methods
3.
Sci Rep; 12(1): 10349, 2022 Jun 20.
Article in English | MEDLINE | ID: mdl-35725870

ABSTRACT

The World Health Organisation reported in 2020 that six of the top 10 causes of death in low-income countries are parasites. Parasites are microorganisms in a relationship with a larger organism, the host. They acquire all benefits at the host's expense. A disease develops if the parasitic infection disrupts normal functioning of the host. This disruption can range from mild to severe, including death. Humans and livestock continue to be challenged by established and emerging infectious disease threats. Vaccination is the most efficient tool for preventing current and future threats. Immunogenic proteins sourced from the disease-causing parasite are worthwhile vaccine components (subunits) due to reliable safety and manufacturing capacity. Publications with 'subunit vaccine' in their title now number in the thousands over the last three decades. However, there are possibly thousands more reporting immunogenicity results without mentioning 'subunit' and/or 'vaccine'. The exact number is unclear given the non-standardised keywords in publications. The study aim is to use machine learning and natural language processing to identify parasite proteins reported in the scientific literature within the last 30 years to induce a protective response in an animal model. Source code to fulfil this aim and the vaccine candidate list obtained are made available.


Subjects
Parasites, Parasitic Diseases, Vaccines, Animals, Machine Learning, Natural Language Processing
4.
Front Genet; 12: 716132, 2021.
Article in English | MEDLINE | ID: mdl-34367264

ABSTRACT

Bovine babesiosis causes significant annual global economic loss in the beef and dairy cattle industry. It is a disease caused by infection of red blood cells by haemoprotozoan parasites of the genus Babesia in the phylum Apicomplexa. Principal species are Babesia bovis, Babesia bigemina, and Babesia divergens. There is no subunit vaccine. Potential therapeutic targets against babesiosis include members of the exportome. This study investigates the novel use of protein secondary structure characteristics and machine learning algorithms to predict exportome membership probabilities. The premise of the approach is to detect characteristic differences that can help distinguish one protein type from another. Structural properties such as a protein's local conformational classification states, backbone torsion angles ϕ (phi) and ψ (psi), solvent-accessible surface area, contact number, and half-sphere exposure are explored here as potential distinguishing protein characteristics. The presented methods that exploit these structural properties via machine learning are shown to have the capacity to distinguish exportome from non-exportome Babesia bovis proteins with 86-92% accuracy (based on 10-fold cross validation and independent testing). These methods are encapsulated in freely available Linux pipelines set up for automated, high-throughput processing. Furthermore, proposed therapeutic candidates for laboratory investigation are provided for B. bovis, B. bigemina, and two other haemoprotozoan species, Babesia canis and Plasmodium falciparum.
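
The 10-fold cross validation mentioned in this abstract can be sketched as follows. This is a minimal illustration in plain Python and not the authors' pipeline; the function name `k_fold_indices`, the random seed, and the sample count of 25 are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    # Shuffle once so that each fold is a random subset of the samples.
    random.Random(seed).shuffle(indices)
    # Slicing with a stride of k spreads samples as evenly as possible.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical data set of 25 proteins split into 10 folds.
splits = list(k_fold_indices(25, k=10))
```

Each sample is held out exactly once, so every accuracy estimate is computed on proteins the model did not see during training.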

5.
Pathogens; 10(6), 2021 May 27.
Article in English | MEDLINE | ID: mdl-34071992

ABSTRACT

Babesia infection of red blood cells can cause a severe disease called babesiosis in susceptible hosts. Bovine babesiosis causes global economic loss to the beef and dairy cattle industries, and canine babesiosis is considered a clinically significant disease. Potential therapeutic targets against bovine and canine babesiosis include members of the exportome, i.e., those proteins exported from the parasite into the host red blood cell. We developed three machine learning-derived methods (two novel and one adapted) to predict, for every known Babesia bovis, Babesia bigemina, and Babesia canis protein, the probability of being an exportome member. Two well-studied apicomplexan-related species, Plasmodium falciparum and Toxoplasma gondii, with extensive experimental evidence on their exportome or excreted/secreted proteins were used as important benchmarks for the three methods. Based on 10-fold cross validation and multiple train-validation-test splits of training data, we expect that over 90% of the predicted probabilities accurately provide a secretory or non-secretory indicator. Only laboratory testing can verify that predicted high exportome membership probabilities are credible exportome indicators. However, the presented methods at least provide those proteins most worthy of laboratory validation and will ultimately save time and money.

6.
FEMS Microbiol Rev; 45(5), 2021 Sep 08.
Article in English | MEDLINE | ID: mdl-33724378

ABSTRACT

To understand the intricacies of microorganisms at the molecular level requires making sense of copious volumes of data such that it may now be humanly impossible to detect insightful data patterns without an artificial intelligence application called machine learning. Applying machine learning to address biological problems is expected to grow at an unprecedented rate, yet it is perceived by the uninitiated as a mysterious and daunting entity entrusted to the domain of mathematicians and computer scientists. The aim of this review is to identify key points required to start the journey of becoming an effective machine learning practitioner. These key points are further reinforced with an evaluation of how machine learning has been applied so far in a broad scope of real-life microbiology examples. This includes predicting drug targets or vaccine candidates, diagnosing microorganisms causing infectious diseases, classifying drug resistance against antimicrobial medicines, predicting disease outbreaks and exploring microbial interactions. Our hope is to inspire microbiologists and other related researchers to join the emerging machine learning revolution.


Subjects
Artificial Intelligence, Machine Learning
7.
Methods Mol Biol; 2183: 29-42, 2021.
Article in English | MEDLINE | ID: mdl-32959239

ABSTRACT

Bioinformatics programs have been developed that exploit informative signals encoded within protein sequences to predict protein characteristics. Unfortunately, there is as yet no program that can predict whether a protein will induce a protective immune response to a pathogen. Nonetheless, distinguishing the pathogen proteins most likely to induce an immune response from those least likely to do so is feasible when predicted protein characteristics are used collectively. Vacceed is a computational pipeline that manages different standalone bioinformatics programs to predict various protein characteristics, which offer supporting evidence on whether a protein is secreted or membrane-associated. A set of machine learning algorithms predicts the most likely pathogen proteins to induce an immune response given the supporting evidence. This chapter provides step-by-step descriptions of how to configure and operate Vacceed for a eukaryotic pathogen of the user's choice.


Subjects
Antigens/immunology, Computational Biology/methods, Epitope Mapping/methods, Eukaryota/immunology, Host-Pathogen Interactions/immunology, Software, Algorithms
8.
Front Genet; 9: 332, 2018.
Article in English | MEDLINE | ID: mdl-30177953

ABSTRACT

Over the last two decades, various in silico approaches have been developed and refined that attempt to identify protein and/or peptide vaccine candidates from informative signals encoded in protein sequences of a target pathogen. To date, no signal has been identified that clearly indicates a protein will effectively contribute to a protective immune response in a host. The premise for this study is that proteins under positive selection from the immune system are more likely to be suitable vaccine candidates than proteins exposed to other selection pressures. Furthermore, our expectation is that protein sequence regions encoding major histocompatibility complex (MHC)-binding peptides will contain consecutive positive selection sites. Using freely available data and bioinformatic tools, we present a high-throughput approach through a pipeline that predicts positive selection sites, protein subcellular locations, and sequence locations of medium to high T-Cell MHC class I binding peptides. Positive selection sites are estimated from a sequence alignment by comparing rates of synonymous (dS) and non-synonymous (dN) substitutions among protein coding sequences of orthologous genes in a phylogeny. The main pipeline output is a list of protein vaccine candidates predicted to be naturally exposed to the immune system and containing sites under positive selection. Candidates are ranked with respect to the number of consecutive sites located on protein sequence regions encoding MHCI-binding peptides. Results are constrained by the reliability of prediction programs and quality of input data. Protein sequences from Toxoplasma gondii ME49 strain (TGME49) were used as a case study. Surface antigen (SAG), dense granule (GRA), microneme (MIC), and rhoptry (ROP) proteins are considered worthy T. gondii candidates. Given 8263 TGME49 protein sequences processed anonymously, the top 10 predicted candidates were all worthy candidates. In particular, the top ten included ROP5 and ROP18, which are T. gondii virulence determinants. The chance of randomly selecting a ROP protein was 0.2% given 8263 sequences. We conclude that the approach described is a valuable addition to other in silico approaches to identify vaccine candidates worthy of laboratory validation and could be adapted for other apicomplexan parasite species (with appropriate data).
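
The ranking step described in this abstract, ordering candidates by consecutive positive selection sites that fall inside predicted MHC class I binding regions, could look roughly like the sketch below. The exact scoring rule used by the published pipeline is not given in the abstract, so the choice of "longest run inside any single region" is an assumption, as are the function names, the protein names, and the coordinates.

```python
def longest_consecutive_run(sites, regions):
    """Longest run of consecutive positive-selection sites inside one region.

    sites:   iterable of 1-based sequence positions under positive selection
    regions: list of (start, end) inclusive positions of predicted binders
    """
    site_set = set(sites)
    best = 0
    for start, end in regions:
        run = 0
        for pos in range(start, end + 1):
            if pos in site_set:
                run += 1
                best = max(best, run)
            else:
                run = 0  # a gap ends the current run
    return best


def rank_candidates(proteins):
    """Return protein names ordered by descending run length, then by name."""
    scored = [
        (longest_consecutive_run(p["sites"], p["regions"]), name)
        for name, p in proteins.items()
    ]
    return [name for score, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical input, not data from the study.
proteins = {
    "protA": {"sites": [10, 11, 12, 40], "regions": [(8, 16)]},
    "protB": {"sites": [5, 7, 9], "regions": [(1, 9)]},
    "protC": {"sites": [20, 21, 22, 23], "regions": [(22, 30)]},
}
ranking = rank_candidates(proteins)
```

Here `protA` ranks first because three adjacent selected sites lie within one binding region, while `protC` contributes only the two sites that overlap its region.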

9.
Int J Parasitol; 47(12): 779-790, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28893639

ABSTRACT

Reverse vaccinology has the potential to rapidly advance vaccine development against parasites, but it is unclear which features studied in silico will advance vaccine development. Here we consider Neospora caninum which is a globally distributed protozoan parasite causing significant economic and reproductive loss to cattle industries worldwide. The aim of this study was to use a reverse vaccinology approach to compile a worthy vaccine candidate list for N. caninum, including proteins containing pathogen-associated molecular patterns to act as vaccine carriers. The in silico approach essentially involved collecting a wide range of gene and protein features from public databases or computationally predicting those for every known Neospora protein. This data collection was then analysed using an automated high-throughput process to identify candidates. The final vaccine list compiled was judged to be the optimum within the constraints of available data, current knowledge, and existing bioinformatics programs. We consider and provide some suggestions and experience on how ranking of vaccine candidate lists can be performed. This study is therefore important in that it provides a valuable resource for establishing new directions in vaccine research against neosporosis and other parasitic diseases of economic and medical importance.


Subjects
Protozoan Antigens/immunology, Neospora/immunology, Protozoan Proteins/immunology, Protozoan Vaccines/immunology, Animals, Protozoan Antigens/classification, Protozoan Antigens/genetics, Base Sequence, Cattle, Exons, Expressed Sequence Tags, Inhibitory Concentration 50, Molecular Sequence Annotation, Neospora/genetics, Oligopeptides/chemistry, Oligopeptides/metabolism, Protozoan Proteins/classification, Protozoan Proteins/genetics, Protozoan Vaccines/classification, Protozoan Vaccines/genetics, Messenger RNA/chemistry
10.
Int J Parasitol; 45(5): 305-18, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25747726

ABSTRACT

Neospora caninum is an apicomplexan parasite which can cause abortion in cattle, imposing a major economic burden. Vaccination has been proposed as the most cost-effective control measure to alleviate this burden. Consequently, the overriding aspiration for N. caninum research is the identification and subsequent evaluation of vaccine candidates in animal models. To save time, cost and effort, it is now feasible to use an in silico approach for vaccine candidate prediction. Precise protein sequences, derived from the correct open reading frame, are paramount and arguably the most important factor determining the success or failure of this approach. The challenge is that publicly available N. caninum sequences are mostly derived from gene predictions. Annotation inaccuracies can lead bioinformatics programs to predict vaccine candidates erroneously. This study evaluates the current N. caninum annotation for potential inaccuracies. Comparisons with annotation from a closely related pathogen, Toxoplasma gondii, are also made to distinguish patterns of inconsistency. More importantly, an mRNA sequencing (RNA-Seq) experiment is used to validate the annotation. Potential discrepancies originating from a questionable start codon context and exon boundaries were identified in 1943 protein coding sequences. We conclude, where experimental data were available, that the majority of N. caninum gene sequences were reliably predicted. Nevertheless, almost 28% of genes were identified as questionable. Given the limitations of RNA-Seq, the intention of this study was not to replace the existing annotation but to support or oppose particular aspects of it. Ideally, many studies aimed at improving the annotation are required to build a consensus. We believe this study, in providing a new resource on gene structure and annotation, is a worthy contributor to this endeavour.


Subjects
Cattle Diseases/parasitology, Coccidiosis/veterinary, Neospora/genetics, Protozoan Proteins/genetics, Protozoan Vaccines/genetics, Animals, Cattle, Cattle Diseases/prevention & control, Coccidiosis/parasitology, Coccidiosis/prevention & control, Computer Simulation, Molecular Sequence Annotation, Neospora/immunology, Protozoan Proteins/immunology, Protozoan Vaccines/immunology
11.
PLoS One; 9(12): e115745, 2014.
Article in English | MEDLINE | ID: mdl-25545691

ABSTRACT

Given thousands of proteins constituting a eukaryotic pathogen, the principal objective for a high-throughput in silico vaccine discovery pipeline is to select those proteins worthy of laboratory validation. Accurate prediction of T-cell epitopes on protein antigens is one crucial piece of evidence that would aid in this selection. Prediction of peptides recognised by T-cell receptors has to date proved to be of insufficient accuracy. The in silico approach is consequently reliant on an indirect method, which involves the prediction of peptides binding to major histocompatibility complex (MHC) molecules. There is no guarantee, nevertheless, that predicted peptide-MHC complexes will be presented by antigen-presenting cells and/or recognised by cognate T-cell receptors. The aim of this study was to determine if predicted peptide-MHC binding scores could provide contributing evidence to establish a protein's potential as a vaccine. Using T-Cell MHC class I binding prediction tools provided by the Immune Epitope Database and Analysis Resource, peptide binding affinities to 76 common MHC I alleles were predicted for 160 Toxoplasma gondii proteins: 75 taken from published studies represented proteins known or expected to induce T-cell immune responses, and 85 were considered less likely vaccine candidates. The results show there is no universal set of rules that can be applied directly to binding scores to distinguish a vaccine from a non-vaccine candidate. We present, however, two proposed strategies exploiting binding scores that provide supporting evidence that a protein is likely to induce a T-cell immune response: one using random forest (a machine learning algorithm) with a 72% sensitivity and 82.4% specificity, and the other using amino acid conservation scores with a 74.6% sensitivity and 70.5% specificity, when applied to the 160 benchmark proteins. More importantly, the binding score strategies are valuable evidence contributors to the overall in silico vaccine discovery pool of evidence.


Subjects
MHC Class I Genes/immunology, Peptides/metabolism, Protein Binding/immunology, Proteins/metabolism, Protozoan Vaccines, Algorithms, Amino Acids/chemistry, Amino Acids/classification, Artificial Intelligence, Computational Biology, Computer Simulation, Protein Databases, T-Lymphocyte Epitopes/immunology, Humans, Peptides/chemistry, Peptides/immunology, Proteins/immunology, T-Lymphocytes/immunology, T-Lymphocytes/parasitology, Toxoplasma
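
For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how sensitivity and specificity are computed from a confusion matrix. The counts are an assumption: they were chosen only because they are consistent with the reported random forest figures on a benchmark of 75 expected positives and 85 expected negatives, and the abstract itself does not report them.

```python
def sensitivity(tp, fn):
    """Fraction of true vaccine candidates that were correctly flagged."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-candidates that were correctly rejected."""
    return tn / (tn + fp)


# Illustrative counts only (75 positives, 85 negatives in total).
sens = sensitivity(tp=54, fn=21)
spec = specificity(tn=70, fp=15)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 72.0%, specificity = 82.4%"
```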
12.
Trends Parasitol; 30(8): 401-11, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25028089

ABSTRACT

A vaccine is urgently needed to prevent cattle neosporosis. This infectious disease is caused by the parasite Neospora caninum, a complex biological system with multifaceted life cycles. An in silico vaccine discovery approach attempts to transform digital abstractions of this system into adequate knowledge to predict candidates. Researchers need current information to implement such an approach, such as understanding evasion mechanisms of the immune system, type of immune response to elicit, availability of data and prediction programs, and statistical models to analyze predictions. Taken together, an in silico approach involves assembly of an intricate jigsaw of interdisciplinary and interdependent knowledge. In this review, we focus on the approach influencing vaccine development against Neospora caninum, which can be generalized to other pathogenic apicomplexans.


Subjects
Cattle Diseases/prevention & control, Coccidiosis/veterinary, Computer Simulation, Protozoan Vaccines, Animals, Protozoan Antigens/genetics, Cattle, Coccidiosis/prevention & control, Computers, Neospora
13.
Bioinformatics; 30(16): 2381-3, 2014 Aug 15.
Article in English | MEDLINE | ID: mdl-24790156

ABSTRACT

We present Vacceed, a highly configurable and scalable framework designed to automate the process of high-throughput in silico vaccine candidate discovery for eukaryotic pathogens. Given thousands of protein sequences from the target pathogen as input, the main output is a ranked list of protein candidates determined by a set of machine learning algorithms. Vacceed has the potential to save time and money by reducing the number of false candidates allocated for laboratory validation. Vacceed, if required, can also predict protein sequences from the pathogen's genome. AVAILABILITY AND IMPLEMENTATION: Vacceed is tested on Linux and can be freely downloaded from https://github.com/sgoodswe/vacceed/releases (includes a worked example with sample data). The Vacceed User Guide can be obtained from https://github.com/sgoodswe/vacceed.


Subjects
Software, Vaccines/chemistry, Algorithms, Artificial Intelligence, Computer Simulation, Protein Sequence Analysis, Vaccines/genetics
14.
BMC Bioinformatics; 14: 315, 2013 Nov 02.
Article in English | MEDLINE | ID: mdl-24180526

ABSTRACT

BACKGROUND: An in silico vaccine discovery pipeline for eukaryotic pathogens typically consists of several computational tools to predict protein characteristics. The aim of the in silico approach to discovering subunit vaccines is to use predicted characteristics to identify proteins which are worthy of laboratory investigation. A major challenge is that these predictions inherently contain hidden inaccuracies and contradictions. This study focuses on how to reduce the number of false candidates using machine learning algorithms rather than relying on expensive laboratory validation. Proteins from Toxoplasma gondii, Plasmodium sp., and Caenorhabditis elegans were used as training and test datasets. RESULTS: The results show that machine learning algorithms can effectively distinguish expected true from expected false vaccine candidates (with an average sensitivity and specificity of 0.97 and 0.98, respectively), for proteins observed to induce immune responses experimentally. CONCLUSIONS: Vaccine candidates from an in silico approach can only be truly validated in a laboratory. Given any in silico output and appropriate training data, the number of false candidates allocated for validation can be dramatically reduced using a pool of machine learning algorithms. This will ultimately save time and money in the laboratory.


Subjects
Antigens/immunology, Caenorhabditis elegans Proteins/immunology, Computational Biology/methods, Computer Simulation, Protozoan Proteins/immunology, Vaccines/immunology, Algorithms, Animals, Antigens/chemistry, Artificial Intelligence, Caenorhabditis elegans Proteins/chemistry, Drug Discovery, Protozoan Proteins/chemistry, Sensitivity and Specificity, Vaccines/chemistry
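
One simple way to combine a pool of machine learning algorithms, as this abstract describes, is a majority vote over their per-protein predictions. The abstract does not say how the study combined its classifiers, so majority voting is an assumed stand-in, and the function name and the vote matrix below are hypothetical.

```python
from collections import Counter


def majority_vote(per_classifier_labels):
    """Combine label lists from several classifiers, one label per protein.

    per_classifier_labels: list of equal-length lists, one per classifier.
    Use an odd number of classifiers with binary labels to avoid ties.
    """
    combined = []
    # zip(*...) regroups the labels so each tuple holds one protein's votes.
    for labels in zip(*per_classifier_labels):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Three hypothetical classifiers voting on four proteins (1 = candidate).
votes = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
consensus = majority_vote(votes)
```

A protein is kept as a candidate only when most classifiers agree, which is one way a pool can reduce the number of false candidates sent for validation.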
15.
Infect Genet Evol; 13: 133-50, 2013 Jan.
Article in English | MEDLINE | ID: mdl-22985682

ABSTRACT

This paper is a review of current knowledge on Neospora caninum in the context of other apicomplexan parasites and with an emphasis on: life cycle, disease, epidemiology, immunity, control and treatment, evolution, genomes, and biological databases and web resources. N. caninum is an obligate, intracellular, coccidian, protozoan parasite of the phylum Apicomplexa. Infection can cause the clinical disease neosporosis, which most notably is associated with abortion in cattle. These abortions are a major root cause of economic loss to both the dairy and beef industries worldwide. N. caninum has been detected in every country in which a study has been specifically conducted to detect this parasite in cattle. The major mode of transmission in cattle is transplacental (or vertical) transmission and several elements of the N. caninum life cycle are yet to be studied in detail. The outcome of an infection is inextricably linked to the precise timing of the infection coupled with the status of the immune system of the dam and foetus. There is no community consensus as to whether it is the dam's pro-inflammatory cytotoxic response to tachyzoites that kills the foetus or the tachyzoites themselves. From economic analysis the most cost-effective approach to control neosporosis is a vaccine. The perfect vaccine would protect against both infection and the clinical disease, and this implies a vaccine is needed that can induce a non-foetopathic cell mediated immunity response. Researchers are beginning to capitalise on the vast potential of -omics data (e.g. genomes, transcriptomes, and proteomes) to further our understanding of pathogens but especially to identify vaccine and drug targets. The recent publication of a genome for N. caninum offers vast opportunities in these areas.


Subjects
Coccidiosis/parasitology, Neospora/genetics, Animals, Biological Evolution, Coccidiosis/epidemiology, Coccidiosis/history, Coccidiosis/prevention & control, Factual Databases, Protozoan Genes, Protozoan Genome, 20th Century History, 21st Century History, Internet, Life Cycle Stages, Neospora/growth & development, Neospora/immunology
16.
Brief Bioinform; 14(6): 753-74, 2013 Nov.
Article in English | MEDLINE | ID: mdl-23097412

ABSTRACT

In this article, a framework for an in silico pipeline is presented as a guide to high-throughput vaccine candidate discovery for eukaryotic pathogens, such as helminths and protozoa. Eukaryotic pathogens are mostly parasitic and cause some of the most damaging and difficult to treat diseases in humans and livestock. Consequently, these parasitic pathogens have a significant impact on economy and human health. The pipeline is based on the principle of reverse vaccinology and is constructed from freely available bioinformatics programs. There are several successful applications of reverse vaccinology to the discovery of subunit vaccines against prokaryotic pathogens but not yet against eukaryotic pathogens. The overriding aim of the pipeline, which focuses on eukaryotic pathogens, is to generate through computational processes of elimination and evidence gathering a ranked list of proteins based on a scoring system. These proteins are either surface components of the target pathogen or are secreted by the pathogen and are of a type known to be antigenic. No perfect predictive method is yet available; therefore, the highest-scoring proteins from the list require laboratory validation.


Subjects
Eukaryotic Cells/immunology, Vaccines, Computer Simulation
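
The combination of elimination and evidence-based scoring outlined in this abstract might be sketched as below. The evidence names, weights, and threshold are hypothetical and are not taken from the article; the sketch only illustrates filtering out proteins that are neither predicted surface-exposed nor secreted, then ranking the rest by a weighted score.

```python
def rank_by_evidence(proteins, weights, threshold=0.5):
    """Eliminate unlikely proteins, then rank the rest by weighted evidence.

    proteins: {name: {evidence_name: score in [0, 1]}}
    weights:  {evidence_name: weight}
    """
    ranked = []
    for name, evidence in proteins.items():
        # Elimination: keep only proteins predicted secreted or on the surface.
        exposure = max(evidence.get("secreted", 0.0), evidence.get("surface", 0.0))
        if exposure < threshold:
            continue
        # Evidence gathering: weighted sum of all available predictions.
        score = sum(w * evidence.get(key, 0.0) for key, w in weights.items())
        ranked.append((name, score))
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Hypothetical prediction scores for three proteins.
proteins = {
    "p1": {"secreted": 0.9, "surface": 0.1, "antigenic": 0.8},
    "p2": {"secreted": 0.2, "surface": 0.3, "antigenic": 0.9},
    "p3": {"secreted": 0.1, "surface": 0.7, "antigenic": 0.5},
}
weights = {"secreted": 1.0, "surface": 1.0, "antigenic": 2.0}
ranked = rank_by_evidence(proteins, weights)
```

In this example `p2` is eliminated despite its high antigenicity score, because it is predicted to be neither secreted nor surface-exposed.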
17.
PLoS One; 7(11): e50609, 2012.
Article in English | MEDLINE | ID: mdl-23226328

ABSTRACT

Next generation sequencing technology is advancing genome sequencing at an unprecedented level. By unravelling the code within a pathogen's genome, every possible protein (prior to post-translational modifications) can theoretically be discovered, irrespective of life cycle stages and environmental stimuli. Now more than ever there is a great need for high-throughput ab initio gene finding. Ab initio gene finders use statistical models to predict genes and their exon-intron structures from the genome sequence alone. This paper evaluates whether existing ab initio gene finders can effectively predict genes to deduce proteins that have presently missed capture by laboratory techniques. An aim here is to identify possible patterns of prediction inaccuracies for gene finders as a whole irrespective of the target pathogen. All currently available ab initio gene finders are considered in the evaluation but only four fulfil high-throughput capability: AUGUSTUS, GeneMark_hmm, GlimmerHMM, and SNAP. These gene finders require training data specific to a target pathogen and consequently the evaluation results are inextricably linked to the availability and quality of the data. The pathogen, Toxoplasma gondii, is used to illustrate the evaluation methods. The results support current opinion that predicted exons by ab initio gene finders are inaccurate in the absence of experimental evidence. However, the results reveal some patterns of inaccuracy that are common to all gene finders and these inaccuracies may provide a focus area for future gene finder developers.


Subjects
Protozoan Genes/genetics, Genomics/methods, High-Throughput Nucleotide Sequencing/methods, Laboratories, Protozoan Proteins/genetics, Toxoplasma/genetics, Base Sequence, Reproducibility of Results
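
Exon predictions from a gene finder are commonly scored by exact coordinate match against a reference annotation. The abstract does not state which measures the study used, so the sketch below is an assumed, generic exon-level evaluation; the function name and the coordinates are hypothetical.

```python
def exon_level_accuracy(predicted, reference):
    """Return (sensitivity, precision) using exact exon boundary matches.

    predicted, reference: iterables of (start, end) exon coordinates.
    """
    predicted, reference = set(predicted), set(reference)
    correct = predicted & reference  # exons with both boundaries right
    sens = len(correct) / len(reference) if reference else 0.0
    prec = len(correct) / len(predicted) if predicted else 0.0
    return sens, prec


# Hypothetical exons on one strand of one contig.
reference = [(100, 200), (300, 400), (500, 650)]
predicted = [(100, 200), (300, 410), (500, 650), (700, 800)]
sens, prec = exon_level_accuracy(predicted, reference)
```

The second predicted exon misses the reference boundary by ten bases and therefore counts as wrong, which shows how strict exact matching is.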
18.
BMC Bioinformatics; 11: 311, 2010 Jun 09.
Article in English | MEDLINE | ID: mdl-20534127

ABSTRACT

BACKGROUND: Whole genome association studies using highly dense single nucleotide polymorphisms (SNPs) are a set of methods to identify DNA markers associated with variation in a particular complex trait of interest. One of the main outcomes from these studies is a subset of statistically significant SNPs. Finding the potential biological functions of such SNPs can be an important step towards further use in human and agricultural populations (e.g., for identifying genes related to susceptibility to complex diseases or genes playing key roles in development or performance). The current challenge is that the information holding the clues to SNP functions is distributed across many different databases. Efficient bioinformatics tools are therefore needed to seamlessly integrate up-to-date functional information on SNPs. Many web services have arisen to meet the challenge but most work only within the framework of human medical research. Although we acknowledge the importance of human research, we identify a need for SNP annotation tools for other organisms. DESCRIPTION: We introduce an R package called FunctSNP, which is the user interface to custom built species-specific databases. The local relational databases contain SNP data together with functional annotations extracted from online resources. FunctSNP provides a unified bioinformatics resource to link SNPs with functional knowledge (e.g., genes, pathways, ontologies). We also introduce dbAutoMaker, a suite of Perl scripts, which can be scheduled to run periodically to automatically create/update the customised SNP databases. We illustrate the use of FunctSNP with a livestock example, but the approach and software tools presented here can also be applied to human and other organisms. CONCLUSIONS: Finding the potential functional significance of SNPs is important when further using the outcomes from whole genome association studies. FunctSNP is unique in that it is the only R package that links SNPs to functional annotation. FunctSNP interfaces to local customised SNP databases which can be built for any species contained in the National Center for Biotechnology Information dbSNP database.


Subjects
Genomics/methods, Single Nucleotide Polymorphism/genetics, Software, Genetic Databases, Genetic Markers
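
The idea of linking SNPs to functional annotation through a local relational database can be illustrated with a small SQLite example. This is not the FunctSNP interface (FunctSNP is an R package); the schema, table names, identifiers, and the `genes_for_snps` helper are all hypothetical and serve only to show a position-based join.

```python
import sqlite3

# In-memory database standing in for a local species-specific SNP database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE snp (snp_id TEXT PRIMARY KEY, chromosome TEXT, position INTEGER);
    CREATE TABLE gene (gene_id TEXT PRIMARY KEY, name TEXT, chromosome TEXT,
                       start_pos INTEGER, end_pos INTEGER);
    """
)
conn.executemany(
    "INSERT INTO snp VALUES (?, ?, ?)",
    [("snp1", "1", 150), ("snp2", "1", 900), ("snp3", "2", 150)],
)
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?, ?, ?)",
    [("g1", "GENE_A", "1", 100, 200), ("g2", "GENE_B", "2", 100, 300)],
)


def genes_for_snps(conn, snp_ids):
    """Return (snp_id, gene name) pairs for SNPs located inside a gene."""
    placeholders = ",".join("?" for _ in snp_ids)
    query = f"""
        SELECT s.snp_id, g.name
        FROM snp AS s
        JOIN gene AS g
          ON g.chromosome = s.chromosome
         AND s.position BETWEEN g.start_pos AND g.end_pos
        WHERE s.snp_id IN ({placeholders})
        ORDER BY s.snp_id
    """
    return conn.execute(query, list(snp_ids)).fetchall()


hits = genes_for_snps(conn, ["snp1", "snp2", "snp3"])
```

Only `snp1` and `snp3` fall inside a gene in this toy data, so `snp2` returns no annotation.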